An Online Method for A Class of Distributionally Robust Optimization with Non-convex Objectives
In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) problems with non-convex objectives, which has important applications in machine learning for improving the robustness of neural networks. In the literature, most approaches to DRO are based on stochastic primal-dual methods. However, primal-dual methods for DRO suffer from several drawbacks: (1) manipulating a high-dimensional dual variable whose size scales with the number of data points is computationally expensive; (2) they are not well suited to online learning, where data arrives sequentially. To address these issues, we consider a class of DRO problems with a KL-divergence regularization on the dual variables, transform the min-max problem into a compositional minimization problem, and propose practical duality-free online stochastic methods that do not require a large mini-batch size. We establish state-of-the-art complexities for the proposed methods with and without a Polyak-Łojasiewicz (PL) condition on the objective. Empirical studies on large-scale deep learning tasks (i) demonstrate that our method can speed up training by more than 2x over baseline methods and save days of training time on a large-scale dataset with 265K images, and (ii) verify the superior performance of DRO over Empirical Risk Minimization (ERM) on imbalanced datasets. Of independent interest, the proposed method can also be used to solve a family of stochastic compositional problems with state-of-the-art complexities.
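For the KL-regularized formulation the abstract describes, the inner maximization over the dual variables has a well-known closed form: the robust objective becomes a scaled log-mean-exp of the per-sample losses, which is exactly the compositional structure (an outer function of an expectation) that the paper exploits. A minimal sketch of that identity; the function name and the value of lambda are illustrative, not from the paper:

```python
import numpy as np

def kl_dro_objective(losses, lam):
    # With a KL-divergence penalty of strength lam on the dual distribution,
    # the inner max collapses to lam * log E[exp(loss / lam)], i.e. the
    # compositional form f(E[g(x)]) with f(u) = lam*log(u), g(x) = exp(loss/lam).
    m = losses.max()  # subtract the max for numerical stability
    return m + lam * np.log(np.mean(np.exp((losses - m) / lam)))

losses = np.array([0.5, 1.0, 3.0])
robust = kl_dro_objective(losses, lam=1.0)
# The objective interpolates between the mean loss (ERM, lam -> infinity)
# and the worst-case loss (lam -> 0).
```

As the comments note, tuning lambda trades off average-case (ERM) against worst-case behavior, which is why the formulation helps on imbalanced data.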
Fairness in the Multi-Secretary Problem
Papasotiropoulos, Georgios, Pishbin, Zein
This paper bridges two perspectives: it studies the multi-secretary problem through the fairness lens of social choice, and examines multi-winner elections from the viewpoint of online decision making. After identifying the limitations of the prominent proportionality notion of Extended Justified Representation (EJR) in the online domain, the work proposes a set of mechanisms that merge techniques from online algorithms with rules from social choice -- such as the Method of Equal Shares and the Nash Rule -- and supports them through both theoretical analysis and extensive experimental evaluation.
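As background on the online-decision-making side, the classical single-secretary problem is solved by the well-known 1/e threshold rule (observe roughly the first n/e candidates, then accept the first one beating everything seen so far). A quick simulation of that classical baseline, purely illustrative and not one of the paper's proposed mechanisms:

```python
import math
import random

def secretary_rule(values):
    # Classical stopping rule: skip the first n/e candidates, then accept
    # the first candidate better than all candidates seen so far.
    n = len(values)
    cutoff = int(n / math.e)
    best_seen = max(values[:cutoff]) if cutoff > 0 else float("-inf")
    for v in values[cutoff:]:
        if v > best_seen:
            return v
    return values[-1]  # forced to accept the last candidate

random.seed(0)
n, trials = 20, 2000
wins = 0
for _ in range(trials):
    vals = random.sample(range(1000), n)
    if secretary_rule(vals) == max(vals):
        wins += 1
success_rate = wins / trials  # close to the asymptotic 1/e ~ 0.37
```

The multi-secretary and fairness variants studied in the paper generalize this single-choice setting to multiple winners and proportionality constraints.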
The dual problem is split into blocks of variables, and each block is solved independently with an arbitrary optimizer. At the end of each iteration, the (weighted) mean of the partial solutions is computed and shared among the nodes. The rate of convergence is analysed, and it is shown that the algorithm converges at a linear rate whenever the optimizer used on the blocks does. Good results are shown on a set of problems. Quality: The paper is overall of high quality. The proofs appear to be correct and the experiments are reasonably well executed. However, I would have liked to see a few more results: on the one hand, there is an effect of how the data points are distributed among the nodes; on the other hand, there is a strong dependency on the strength of regularization. With small regularization, a large H will lead to greater conflict between the partial solutions, so the convergence rate might be very slow and a lot of computation is wasted.
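A minimal sketch of the split-solve-average scheme the review summarizes, with each node's "arbitrary optimizer" reduced to a single gradient step on a ridge-regression toy problem (all problem sizes and step lengths here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, rows, d, reg = 4, 25, 5, 1.0
A = [rng.normal(size=(rows, d)) for _ in range(n_nodes)]
b = [rng.normal(size=rows) for _ in range(n_nodes)]

# Centralized solution of min_w sum_k ||A_k w - b_k||^2 + reg * ||w||^2
A_all, b_all = np.vstack(A), np.concatenate(b)
w_star = np.linalg.solve(A_all.T @ A_all + reg * np.eye(d), A_all.T @ b_all)

w, step = np.zeros(d), 0.01
for _ in range(500):
    # each node improves the shared iterate on its own data block ...
    local = [w - step * (2 * A_k.T @ (A_k @ w - b_k) + 2 * reg / n_nodes * w)
             for A_k, b_k in zip(A, b)]
    # ... then the (here: uniform) mean is computed and shared among nodes
    w = np.mean(local, axis=0)
```

Averaging one gradient step per node is equivalent to a gradient step on the global objective with a step size shrunk by the number of nodes, which is the simplest instance of the conflict-between-blocks effect the review raises.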
Aioli: A Unified Optimization Framework for Language Model Data Mixing
Chen, Mayee F., Hu, Michael Y., Lourie, Nicholas, Cho, Kyunghyun, Ré, Christopher
Language model performance depends on identifying the optimal mixture of data groups to train on (e.g., law, code, math). Prior work has proposed a diverse set of methods to efficiently learn mixture proportions, ranging from fitting regression models over training runs to dynamically updating proportions throughout training. Surprisingly, we find that no existing method consistently outperforms a simple stratified sampling baseline in terms of average test perplexity per group. In this paper, we study the cause of this inconsistency by unifying existing methods into a standard optimization framework. We show that all methods set proportions to minimize total loss, subject to a method-specific mixing law -- an assumption on how loss is a function of mixture proportions. We find that existing parameterizations of mixing laws can express the true loss-proportion relationship empirically, but the methods themselves often set the mixing law parameters inaccurately, resulting in poor and inconsistent performance. Finally, we leverage the insights from our framework to derive a new online method named Aioli, which directly estimates the mixing law parameters throughout training and uses them to dynamically adjust proportions. Empirically, Aioli outperforms stratified sampling on 6 out of 6 datasets by an average of 0.28 test perplexity points, whereas existing methods fail to consistently beat stratified sampling, doing up to 6.9 points worse. Moreover, in a practical setting where proportions are learned on shorter runs due to computational constraints, Aioli can dynamically adjust these proportions over the full training run, consistently improving performance over existing methods by up to 12.01 test perplexity points.
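A toy instance of the framework's central object, a mixing law, can make the abstract concrete. The linear parameterization below is purely illustrative (it is not Aioli's law): fit the law from observed (proportions, per-group loss) pairs, then choose proportions that minimize the predicted average loss.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3  # number of data groups (e.g. law, code, math)

# Planted linear mixing law: loss_i(p) = c_i + (B p)_i, where training on
# more of a group's data lowers losses
B_true = rng.uniform(-1.0, 0.0, size=(k, k))
c_true = rng.uniform(2.0, 3.0, size=k)

# Observe per-group losses under several random mixture proportions
P = rng.dirichlet(np.ones(k), size=20)
L = P @ B_true.T + c_true

# Since proportions sum to 1, the law is L = P @ M with M = B^T + 1 c^T,
# so least squares recovers M exactly from noiseless observations
M_hat, *_ = np.linalg.lstsq(P, L, rcond=None)

# Pick the candidate proportions minimizing the predicted average loss
cands = rng.dirichlet(np.ones(k), size=500)
p_best = cands[np.argmin((cands @ M_hat).mean(axis=1))]
```

The paper's point is that even when such a parameterization can express the true loss-proportion relationship, methods can still set its parameters inaccurately; Aioli instead estimates them online during training.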
Online Graph Filtering Over Expanding Graphs
Graph filters are a staple tool for processing signals over graphs in a multitude of downstream tasks. However, they are commonly designed for graphs with a fixed number of nodes, even though real-world networks typically grow over time. This topological evolution is often known only up to a stochastic model, making conventional graph filters ill-equipped to handle such topological changes, their uncertainty, and the dynamic nature of the incoming data. To tackle these issues, we propose an online graph filtering framework that relies on online learning principles. We design filters for scenarios where the topology is both known and unknown, including a learner adaptive to such evolution. We conduct a regret analysis to highlight the role played by the different components, such as the online algorithm, the filter order, and the growing graph model. Numerical experiments with synthetic and real data corroborate the proposed approach for graph signal inference tasks and show competitive performance with respect to baselines and state-of-the-art alternatives.
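The setting can be illustrated with a polynomial graph filter and an online least-mean-squares learner for the filter taps. This is a deliberately simplified sketch (fixed-size graph, no growing topology) and every name in it is illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 8, 3  # number of nodes, filter order

# Random symmetric adjacency as the graph shift operator S
S = np.triu(rng.binomial(1, 0.4, size=(n, n)).astype(float), 1)
S = S + S.T
S /= np.abs(np.linalg.eigvalsh(S)).max()  # normalize for stability

h_true = np.array([1.0, -0.5, 0.25])  # unknown filter taps to track

def apply_filter(h, x):
    # Polynomial graph filter: y = sum_k h_k S^k x
    y, Skx = np.zeros_like(x), x.copy()
    for hk in h:
        y += hk * Skx
        Skx = S @ Skx
    return y

# Online learner: stream graph signals, update taps by least mean squares
h, mu = np.zeros(K), 0.1
for _ in range(3000):
    x = rng.normal(size=n)
    Z = np.column_stack([np.linalg.matrix_power(S, k) @ x for k in range(K)])
    err = Z @ h - apply_filter(h_true, x)  # prediction residual
    h -= mu * Z.T @ err / n
```

The paper's contribution lies in extending this kind of online learner to graphs whose node set expands, with the topology known only stochastically.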
This paper proposes a new boosting method that represents a tradeoff between online and offline learning. The main idea of the method is to maintain a reservoir of training examples (of fixed size) from which to train the weak learners. At each boosting iteration, new examples are added to the reservoir and then a selection strategy is used to reduce the reservoir to its original fixed size before the weak learner is trained. Several naive selection strategies are proposed but the main contribution of the paper is a more sophisticated selection strategy whose goal is to remove examples from the reservoir so that a weak learner trained on the reduced set will minimize the error computed on the whole set before reduction. The resulting algorithm is applied on four computer vision datasets, where it is shown to outperform several other online boosting methods. The idea of using a reservoir is original and very interesting.
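The fixed-size reservoir described in the review can be maintained with a simple grow-then-shrink loop. The sketch below uses the most naive selection strategy (uniform random retention of the current pool), not the paper's error-driven strategy, and all names in it are illustrative:

```python
import random

def reservoir_update(reservoir, new_examples, capacity, rng):
    # Grow the reservoir with the incoming batch, then shrink it back to
    # its fixed capacity before the weak learner is trained. Here the
    # selection strategy is the naive one: keep a uniform random subset.
    pool = reservoir + new_examples
    if len(pool) <= capacity:
        return pool
    return rng.sample(pool, capacity)

rng = random.Random(0)
reservoir = []
for step in range(50):
    batch = list(range(step * 4, step * 4 + 4))  # 4 new examples per round
    reservoir = reservoir_update(reservoir, batch, capacity=32, rng=rng)
```

The paper's sophisticated strategy would replace the `rng.sample` call with a selection that minimizes the weak learner's error on the pre-reduction pool.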
A Trainable Approach to Zero-delay Smoothing Spline Interpolation
Ruiz-Moreno, Emilio, López-Ramos, Luis Miguel, Beferull-Lozano, Baltasar
The task of reconstructing smooth signals from streamed data in the form of signal samples arises in various applications. This work addresses such a task subject to a zero-delay response; that is, the smooth signal must be reconstructed sequentially as soon as a data sample is available and without access to subsequent data. State-of-the-art approaches solve this problem by interpolating consecutive data samples using splines. Here, each interpolation step yields a piece that ensures a smooth signal reconstruction while minimizing a cost metric, typically a weighted sum of the squared residual and a derivative-based measure of smoothness. As a result, a zero-delay interpolation is achieved in exchange for an almost surely higher cumulative cost compared to interpolating all data samples at once. This paper presents a novel approach to further reduce this cumulative cost on average. First, we formulate a zero-delay smoothing spline interpolation problem from a sequential decision-making perspective, allowing us to model the future impact of each interpolated piece on the average cumulative cost. Then, an interpolation method is proposed to exploit the temporal dependencies between the streamed data samples. Our method is assisted by a recurrent neural network, trained to reduce the average accumulated cost over a set of example data samples collected from the same signal source as the signal to be reconstructed. Finally, we present extensive experimental results for synthetic and real data showing how our approach outperforms the aforementioned state-of-the-art approaches.
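One zero-delay interpolation step can be sketched under strong simplifying assumptions: unit-spaced samples, a zero initial slope, a pure curvature cost, and no learned component. The left value and slope of each cubic piece are fixed by continuity, and the free right-end slope is chosen to minimize the bending energy of the piece, which has the closed form m1 = (3*(y1 - y0) - m0) / 2. This is a hand-derived baseline for illustration only, not the paper's RNN-assisted method:

```python
def zero_delay_piece(y0, m0, y1):
    # Cubic Hermite piece on [0, 1] with fixed left value/slope (continuity
    # with the previous piece) and the right slope chosen to minimize the
    # bending energy int_0^1 f''(t)^2 dt, i.e. the natural condition f''(1)=0.
    d = y1 - y0
    m1 = (3.0 * d - m0) / 2.0
    def piece(t):
        h00 = 2 * t**3 - 3 * t**2 + 1   # Hermite basis functions
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        return h00 * y0 + h10 * m0 + h01 * y1 + h11 * m1
    return piece, m1

# Stream samples one at a time; each piece is emitted with zero delay.
samples = [0.0, 1.0, 0.5, 2.0]
pieces, m = [], 0.0  # assume the reconstruction starts with zero slope
for y0, y1 in zip(samples, samples[1:]):
    p, m = zero_delay_piece(y0, m, y1)
    pieces.append(p)
```

The paper's method would instead choose the free parameters of each piece with a trained recurrent network that anticipates future samples, rather than with this myopic closed form.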
Minimax Optimal $Q$ Learning with Nearest Neighbors
$Q$ learning is a popular model-free reinforcement learning method. Most existing works focus on analyzing $Q$ learning for finite state and action spaces. If the state space is continuous, the original $Q$ learning method cannot be applied directly. A modification of the original $Q$ learning method was proposed in (Shah and Xie, 2018), which estimates $Q$ values with nearest neighbors, making $Q$ learning suitable for continuous state spaces. (Shah and Xie, 2018) shows that the convergence rate of the estimated $Q$ function is $\tilde{O}(T^{-1/(d+3)})$, which is slower than the minimax lower bound $\tilde{\Omega}(T^{-1/(d+2)})$, indicating that this method is not efficient. This paper proposes two new $Q$ learning methods to close the gap in convergence rates left by (Shah and Xie, 2018), one offline and the other online. Although we still use a nearest neighbor approach to estimate the $Q$ function, our algorithms differ crucially from (Shah and Xie, 2018). In particular, we replace the kernel nearest neighbor estimate over a discretized region with a direct nearest neighbor approach. Consequently, our approach significantly improves the convergence rate. Moreover, the time complexity is also significantly improved in high-dimensional state spaces. Our analysis shows that both the offline and online methods are minimax rate optimal.
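The nearest-neighbor idea can be illustrated on a toy 1-D chain. This is a generic sketch of the concept, not the authors' algorithm or analysis: continuous states are mapped to their nearest anchor point and tabular $Q$-learning runs on the anchor table, here under a uniformly random behavior policy. All constants and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
anchors = np.linspace(0.0, 1.0, 11)   # anchor states covering [0, 1]
Q = np.zeros((len(anchors), 2))       # actions: 0 = left, 1 = right
alpha, gamma = 0.2, 0.9

def nearest(s):
    # Map a continuous state to the index of its nearest anchor
    return int(np.argmin(np.abs(anchors - s)))

def step(s, a):
    # Toy dynamics: move +/- 0.1 with small noise; reward near the right end
    s2 = float(np.clip(s + (0.1 if a == 1 else -0.1) + rng.normal(0, 0.01), 0, 1))
    return s2, (1.0 if s2 > 0.95 else 0.0)

s = 0.5
for _ in range(20000):
    i = nearest(s)
    a = int(rng.integers(2))          # uniformly random behavior policy
    s2, r = step(s, a)
    Q[i, a] += alpha * (r + gamma * Q[nearest(s2)].max() - Q[i, a])
    s = 0.5 if s2 > 0.95 else s2      # reset after reaching the reward region
```

The statistical question the paper studies is how the anchor resolution and the direct (rather than kernel-weighted) nearest-neighbor assignment affect the convergence rate of such estimates as the state dimension grows.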
A unified machine learning approach to time series forecasting applied to demand at emergency departments
Vollmer, Michaela A. C., Glampson, Ben, Mellan, Thomas A., Mishra, Swapnil, Mercuri, Luca, Costello, Ceire, Klaber, Robert, Cooke, Graham, Flaxman, Seth, Bhatt, Samir
There were 25.6 million attendances at Emergency Departments (EDs) in England in 2019, corresponding to an increase of 12 million attendances over the preceding ten years. The steadily rising demand at EDs creates a constant challenge to provide adequate quality of care while maintaining standards and productivity. Managing hospital demand effectively requires adequate knowledge of the future rate of admission. Using 8 years of electronic admissions data from two major acute care hospitals in London, we develop a novel ensemble methodology that combines the outcomes of the best performing time series and machine learning approaches in order to make highly accurate forecasts of demand 1, 3 and 7 days in the future. The two hospitals face average daily demands of 208 and 106 attendances, respectively, and experience considerable volatility around these means. However, our approach is able to predict attendances at these emergency departments one day in advance to within a mean absolute error of +/- 14 and +/- 10 patients, corresponding to mean absolute percentage errors of 6.8% and 8.6%, respectively. Our analysis compares machine learning algorithms to more traditional linear models. We find that linear models often outperform machine learning methods and that the quality of our predictions is comparable across the forecasting horizons of 1, 3 and 7 days, as measured by MAE. In addition to comparing and combining state-of-the-art forecasting methods to predict hospital demand, we consider two different hyperparameter tuning methods, enabling a faster deployment of our models without compromising performance. We believe our framework can readily be used to forecast a wide range of policy-relevant indicators.
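A generic sketch of the combine-by-validation-error idea behind such ensembles (the toy models, the inverse-MAE weighting, and all names here are illustrative, not the paper's methodology):

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic daily attendance series: weekly seasonality plus noise
t = np.arange(400)
y = 200 + 15 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 8, size=t.size)

train, val, test = y[:300], y[300:350], y[350:]

def seasonal_naive(history, horizon):
    # Repeat the last observed week of the history
    return np.array([history[len(history) - 7 + (h % 7)] for h in range(horizon)])

def global_mean(history, horizon):
    return np.full(horizon, history.mean())

models = [seasonal_naive, global_mean]

# Weight each model by its inverse validation MAE, normalized to sum to 1
val_mae = np.array([np.mean(np.abs(m(train, len(val)) - val)) for m in models])
w = (1 / val_mae) / np.sum(1 / val_mae)

history = y[:350]
preds = np.array([m(history, len(test)) for m in models])
ensemble = w @ preds
ens_mae = np.mean(np.abs(ensemble - test))
```

By the triangle inequality, the ensemble's MAE never exceeds the weighted average of the individual MAEs, which is the basic safety property such combinations rely on.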